Abstract: Prior to committing personnel to investigate a
building or suspicious site such as a cave, it is
imperative to determine the importance and current
danger of the site. To this end, sensors on a robotic
platform can interrogate the site prior to sending in
personnel. This paper investigates methods to exploit
multiple sensor modalities in order to automatically 1)
detect human presence, and 2) detect human
infrastructure and recent human activity. The paper
describes 10 experimental scenarios to support these two
tasks, demonstrates what type of inference each
modality can make, and shows how to fuse the
information from all sensors. Experimental results are
also provided for the detection of the presence of
humans.

Keywords:  Sensor  fusion,  Spatial-temporal  processing, 

1  Introduction

During  security  sweeps,  it  is  essential  that  the  scout  is 
able  to  determine  whether  or  not  a  building  is  occupied, 
and  whether  an  unoccupied  building  has  accommodated 
recent  human  activity  or  is  simply  abandoned.  Such 
situational  awareness  is  essential  for  scouts  to  safely  enter 
buildings  relevant  to  their  mission.  Similarly,  scouts  may 
need  to  gather  intelligence,  surveillance  and 
reconnaissance  (ISR)  information  about  tactically 
important sites such as caves, tunnels, and other
hard-to-reach locations. The scout must understand whether
such a site actually includes human infrastructure such as
electrical wiring, man-made vents, electrical utilities,
generators, cooking utensils, etc. If the site does support
human  activity,  the  scout  then  must  know  if  the  site  is 
presently  occupied,  recently  used,  or  abandoned. 

The  ability  for  the  scout  to  obtain  information  about 
human  presence  or  recent  human  activity  via  the  use  of 
mobile  sensors  would  be  most  advantageous.  This  paper 
discusses  possible  multi-sensor  solutions  for  the  automatic 
detection  of  human  presence  and  recent  human  activity. 
The  technology  to  detect  the  presence  of  humans  is  much 
more  mature  than  the  technology  to  detect  recent  human 
activity after the people have vacated the area. For
instance, researchers are developing sensor systems that
detect footfalls (or gait) [1, 2], speech, the spectral
response of human skin, etc. [3]. Little work has focused
on the detection of human infrastructure in remote sites
and the indirect detection of human activities. Fortunately,
when  people  perform  activities,  they  leave  behind  many 
clues  that  can  be  exploited  by  forensic  sensor  systems.  For 
instance,  if  the  people  used  any  machinery,  the  machine 
could  still  be  warm.  It  is  possible  that  the  concentration 
levels  of  human  pheromones  in  a  room  may  reveal  the 
prior  presence  of  people. 

This  paper  is  organized  as  follows.  Section  2  lists  the 
different  modalities  that  are  being  considered,  and 
Section 3 lists different data collection scenarios that have
been executed to test multi-sensor human presence and/or
human activity detection. Section 4 details a fusion
experiment for the detection of human presence, and
Section 5 discusses a proposed approach for human
infrastructure and activity detection. Finally, Section 6
concludes  the  paper  and  discusses  further  research. 

2  Sensors 

In order to detect human infrastructure and activity,
several  common  sensor  modalities  are  considered. 
Because  mission  requirements  change,  these  sensors 
cannot  be  deployed  at  fixed  locations.  Rather,  they  must 
fit  on  a  mobile  platform  so  they  can  travel  inside  the 
building  or  other  tactical  sites.  As  a  result,  the  form-factor 
of  the  sensors  must  be  small  enough  to  fit  on  a  robotic 
platform. Figure 1 shows a model of a prototype robotic
system that includes the requisite suite of sensors¹.
Sensors that meet the detection functionality and size
requirements are described below, and Figure 2 provides
pictures of many of them.

•  The acoustic sensors used are piezoelectric
microphones, which can detect speech, sounds
generated by machinery, etc.

•  Seismic sensors are 3-axis sensors that can detect
vibrations in the ground. They are used to detect
footfalls, vibrations caused by machines being operated,
etc. Accelerometers can also detect vibrations in pipes
that are produced by the flow of water.

¹ The actual robotic prototype will be available before
publication of the final version of this paper.

Figure 1: Mobile platform (Packbot) with mockup model
of sensor packages.

•  RF  detectors  can  detect  any  RF  activity  such  as  the  use 
of  cell  phones. 

•  Magnetic  (B-field)  sensors  can  be  used  to  detect 
ferromagnetic  materials  carried  by  people,  e.g.,  keys, 
firearms,  and  knives.  These  sensors  may  also  detect  the 
usage  of  computer  monitors. 

•  Electrostatic  (E-field)  sensors  can  be  used  to  detect  the 
built-up  electric  charge  on  personnel.  Together  with 
magnetic  sensors,  they  can  also  detect  electrical  activity 
in  the  vicinity  such  as  the  usage  of  computer  keyboards. 

•  Chemical  sensors  can  be  used  to  detect  the  presence  of 
different  kinds  of  chemicals  in  the  atmosphere  such  as 
pheromones  and  household  chemical  vapors. 

•  Passive  infrared  devices  are  very  inexpensive  sensors 
that  detect  the  nearby  presence  of  a  warm  body,  e.g.,  a 
human,  within  a  cone  shaped  field  of  view. 

•  Visible  imagers  can  capture  color  or  grayscale  video 
for  human  gait  detection  and  object  recognition. 

•  Infrared  imagers  can  detect  and  localize  hot  bodies 
and  warm  surfaces,  including  the  vents  in  tunnels.  They 
can  also  provide  thermal  profiling  of  buildings,  where 
warmer  rooms  are  indicative  of  current  or  recent  human 
inhabitation. 

•  Micro radars can detect and track people at short
ranges. Low-frequency radars can even see through
walls.

3  Data  Collections 

Multi-sensor  data  was  collected  for  a  number  of  different 
scenarios.  Most  of  the  data  collection  occurred  in  a 
remote  building  that  contains  some  machinery.  For  most 
scenarios the following sensors collected data: visible
camera, infrared camera, magnetic, electrostatic, acoustic,
seismic, and chemical. Some scenarios were designed to
evaluate multi-sensor systems for detection of human
presence, and the other scenarios were designed for
sensing prior human activity.

Figure 2: Acoustic (piezoelectric microphone),
seismic (accelerometer), passive infrared (motion
detector), small radar (2.5 W, 5.8 GHz radar),
E-field (Quasar 3-axis), forward-looking infrared
camera, 10 compound chemical sensors for human
activity detection.

The  scenarios  to  evaluate  direct  human  presence  detection 
include: 

•  Corridor Scenario: The suite of sensors is placed at
the center of a hallway. A person walks down the
hallway. The goal is to determine the range from the
sensors at which the person is detected.

•  Human  Walking  and  Talking  Scenario:  The 
sensors  are  observing  people  walking  and  talking  in  a 
room.  The  goal  is  to  determine  how  many  modalities 
can  detect  the  standard  human  activities. 

Other  experiments  are  designed  to  indirectly  detect 
humans  by  detecting  signals  that  humans  create  while 
using  machinery.  These  scenarios  include: 

•  Cell Phone Scenario: Sensors observe the ringing
and usage of a cell phone. Cell phones are very
prevalent nowadays, especially in third-world
countries where the wired-telephone infrastructure is
rather limited. The goal is to detect their usage using
multiple modalities, such as RF detectors and
acoustic sensors.


•  Bathroom  Scenario:  A  person  flushes  a  toilet,  and 
sensors  are  located  in-situ  and  remotely  to  detect  the 
flushing  event.  The  goal  is  to  determine  which 
sensors  can  remotely  detect  the  water  flow  through 
the  pipes  as  a  result  of  the  flushing  event.  For 
instance,  an  accelerometer  attached  to  the  pipes  far 
away  from  the  bathroom  should  detect  the  event. 
Also,  the  opening  of  the  bathroom  door  can  be 
detected  by  magnetic,  seismic,  electrostatic,  and  both 
IR  and  visible  cameras. 

•  Computer  Keyboard  Scenario:  The  goal  is  to  detect 
the  usage  of  the  keypad  using  several  sensor 
modalities.  In  this  information  age,  computer  keypads 
are  used  for  a  very  large  number  of  applications 
including  planning,  information  downloads, 
communications,  etc. 

•  Computer Monitor Scenario: The sensors observe
the usage of a computer monitor. The goal
is similar to that of the keypad usage.

The  final  class  of  scenarios  is  used  to  determine  the 
feasibility  of  sensors  to  determine  either  current  or  prior 
human  activities.  Sensors,  for  example,  could  detect 
signals  radiating  from  residual  materials  and  energy 
directly  due  to  human  activity  or  due  to  human 
infrastructure  to  support  the  activity.  The  scenarios 
include: 

•  Machine  Shop  Scenario:  Sensors  observed  a  drill 
press  in  a  secluded  building.  The  press  was  used  to 
drill a bore in a wooden plank. The goal here is to find
how many sensor modalities can detect the machine
while it is in use and to determine how long after the
machine is turned off the residual information can still
signify prior usage.

•  Conference  Room  Scenario:  In  this  scenario,  people 
sat  around  a  table  and  talked  to  each  other.  Some  of 
them  were  smoking  cigars  and  some  were  drinking 
coffee.  After  a  period  of  time,  they  left  the  conference 
room,  leaving  behind  burning  cigars  and  unfinished 
coffee.  The  sensors  observe  the  room  after  the  people 
leave.  The  goal  for  the  sensor  is  to  determine  that  the 
conference  room  was  recently  used  by  some  people 
due  to  the  warm  seats  and  chemical  scents  left  behind. 
It  is  also  important  to  determine  how  long  the  sensors 
can  continue  to  detect  prior  human  presence. 

•  Vent  Scenario:  A  cave  or  tunnel  that  is  currently 
supporting  human  activities  will  require  vents  to 
circulate  in  fresh  air.  The  goal  of  this  scenario  is  to 
determine  which  modalities  can  distinguish  man-made 
air  circulation  from  natural  air  movement,  e.g.,  wind. 

•  Portable  Generator  Scenario:  In  a  cave  or  a  tunnel, 
it  is  most  likely  that  a  portable  generator  will  be  used. 
The  goal  is  to  detect  this  man-made  object,  both  while 
it  is  being  used  and  a  few  hours  after  its  operation. 


4  Detection  of  Human  Presence 

Figure 3: Different sensor outputs in Bathroom scenario

The detection of personnel may be accomplished either by
directly detecting the person or by indirectly detecting the
actions or objects associated with a human being. Direct
means of detecting personnel include the use of
chemical, electrostatic, and passive infrared (PIR) sensors
and imagers (visible and infrared). For instance, the chemical
sensor responds to human pheromones by producing
appropriate sensing outputs. Algorithm development for
the chemical sensor is still in progress. Electrostatic
sensors detect changes to the ambient electric field caused
by static charges on the human skin. The electrostatic
sensor produces a detectable signal when a
person is walking near the sensor (see Figure 3), and a
simple threshold detector can detect the presence of a
body. The PIR generates an output that is proportional to
the body temperature of a person. A simple threshold
above the ambient noise would detect the presence of a
hot  body  within  the  vicinity  of  the  PIR  sensor.  Imagers 
can  distinguish  the  silhouette  of  the  human  being  when 
there  is  sufficient  contrast  from  the  background. 
Furthermore,  it  can  be  possible  to  segment  human  skin 
from  an  image  based  upon  color  [4].  Finally,  when  the 
human  walks,  the  change  of  the  human’s  silhouette  due  to 
his/her  gait  produces  a  unique  signature  [13].  Indirect 
means  of  detecting  personnel  include  the  usage  of 
acoustic,  seismic,  magnetic,  passive  infrared  (PIR),  and 
chemical  information  collected  through  the  respective 
sensors.  Acoustic  sensors  can  capture  human  speech,  and 
one  can  exploit  speech  processing  algorithms  to  determine 
whether  or  not  human  speech  can  be  extracted  from  the 
background noise. In order to detect the presence of
people, the acoustic signal spectrum between 50 Hz and
2000 Hz is analyzed. An algorithm [1] has been
developed to detect personnel based on the statistical analysis
of the energy content in at least three of the four bands,
where each band is roughly 500 Hz wide. A seismic sensor
detects the closing of a door if it is slammed against the
frame. It also detects the footfalls of a walking person.


We  have  developed  an  algorithm  to  detect  the  gait 
frequency  of  humans  [1,2]  using  seismic  sensor  data.  The 
typical gait frequency lies between 1.8 and 2.2 Hz. If these
frequency  components  and  their  harmonics  are  present  in 
the  seismic  data,  then  it  is  likely  that  there  is  a  person 
present  in  the  neighborhood  of  the  sensor.  Figure  4  shows 
the  output  of  a  seismic  sensor.  In  the  figure,  the  signature 
of  the  footsteps  appears  as  a  spike  that  repeats  at  a 
characteristic  frequency.  A  magnetic  sensor  detects  the 
opening  and  closing  of  a  door  through  the  changes  in 
magnetic  flux.  If  a  person  carries  any  ferromagnetic 
material, such as keys or short-guns, the magnetic sensor
also generates an output that can be thresholded to detect the
presence of such a material. An algorithm for tracking the
movement  of  ferromagnetic  material  [5]  can  be  used  as  an 
indirect  indication  of  the  presence  of  a  person. 
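
The gait-frequency test described above can be illustrated with a minimal sketch. This is not the algorithm of [1, 2]; it assumes that rectifying the seismic trace and locating the strongest spectral peak is an adequate stand-in for the gait feature. The function names, the 0.5 to 3.5 Hz search range (chosen here so that the first harmonic of a 1.8 to 2.2 Hz cadence falls outside it), and the 0.05 Hz grid step are all hypothetical.

```python
import math


def power_at(signal, fs, freq):
    """Power of `signal` at `freq` Hz, computed with a single-bin DFT."""
    re = sum(x * math.cos(2 * math.pi * freq * n / fs) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / fs) for n, x in enumerate(signal))
    return (re * re + im * im) / len(signal)


def dominant_cadence(seismic, fs, f_lo=0.5, f_hi=3.5, step=0.05):
    """Frequency (Hz) carrying the most power in the rectified, mean-removed trace."""
    envelope = [abs(x) for x in seismic]          # footfall spikes become positive pulses
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]       # remove the DC component
    n_steps = int(round((f_hi - f_lo) / step))
    freqs = [f_lo + k * step for k in range(n_steps + 1)]
    return max(freqs, key=lambda f: power_at(centered, fs, f))


def looks_like_gait(seismic, fs, band=(1.8, 2.2)):
    """True if the dominant cadence lies in the typical gait band quoted in the text."""
    f = dominant_cadence(seismic, fs)
    return band[0] <= f <= band[1]


if __name__ == "__main__":
    fs = 100.0
    # Synthetic footfalls: one impulse every 0.5 s (2 Hz cadence) for 10 s.
    steps = [1.0 if n % 50 == 0 else 0.0 for n in range(1000)]
    print(dominant_cadence(steps, fs), looks_like_gait(steps, fs))
```

A practical detector would also check for the harmonics mentioned in the text and would need a noise-dependent threshold on the peak power; both are omitted here for brevity.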

Figure 4: Footfalls identified in the seismic sensor output

Figure 5: Output of acoustic, PIR, and seismic sensors


Figure  6:  Corridor  experiment  -  Ground  truth 


The  remainder  of  this  section  demonstrates  the  fusion  of 
acoustic,  PIR,  and  seismic  sensors  for  the  direct  detection 
of  humans  walking  through  a  hallway.  The  whole  fusion 
system  is  evaluated  over  data  collected  in  support  of  the 
Corridor  Scenario.  It  consists  of  determining  the 
likelihood  of  human  presence  via  the  signal  level  of  each 
sensor,  and  then,  combining  these  likelihoods  via 
Bayesian  fusion  to  obtain  the  posterior  probability  of 
human  presence  given  the  signal  levels  of  all  three 
sensors. 

Figure  5  shows  the  output  of  the  signal  levels  of  the  three 
sensors,  and  Figure  6  shows  the  ground  truth  location  of 
the person in the hall. The hall is X meters long, so a
location of 0 or X means that the person is located at one
end of the hall or the other. The sensors are located at Y,
which is near the center of the hallway. The person is
walking  over  the  interval  between  70  and  130  seconds. 
The  acoustic  and  seismic  signals  indicate  footfall 
signatures  when  the  person  is  passing  close  to  the  sensors. 
Furthermore,  the  PIR  sensor  provides  a  bipolar  response 
when  the  person  passes  within  the  field  of  view.  The 
seismic  signal  also  includes  significant  background  noise. 


To  detect  people,  the  acoustic  and  seismic  data  is 
processed  to  form  spectral  and  gait  features,  respectively, 
as  described  in  [1].  For  the  PIR  data,  the  signal 
magnitude  forms  the  features.  Next,  the  distribution  of  the 
features  conditioned  on  the  different  hypotheses  is 
determined. Specifically, we define $H_0$ and $H_1$ as the null
and human-present hypotheses. The likelihood of each
hypothesis is defined as the probability of the observation,
i.e., feature, conditioned on the hypothesis,

$$l_{H_i}(x_s) = p(x_s \mid H_i) \qquad (1)$$

for $i = 0, 1$ and $s \in S$, where $S = \{\text{acoustic}, \text{PIR}, \text{seismic}\}$.
The conditional probability is modeled as a Gaussian
distribution,

$$p(x_s \mid H_i) = \mathcal{N}(x_s;\, \mu_{s,i},\, \sigma_{s,i}^2). \qquad (2)$$

The statistics of the distribution of the signal data for a
given hypothesis are determined by using the sample mean
and variance of training data. Let $x_{s,t}$ represent the time
series associated with sensor $s$. Then,
(5)

where $p(H_0)$ and $p(H_1)$ represent the prior probabilities for
the absence and presence of a human, respectively. This
paper assumes an uninformative prior, i.e.,
$p(H_0) = p(H_1) = 0.5$. Figure 7 shows the posterior probabilities of
the three sensors as a function of time for the corresponding
signal data in Figure 5. The closest point of approach of
the human to the sensor package occurs at t = 90 sec, which
corresponds to the case where the posterior probability
approaches 1 in Figure 7.
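
The per-sensor processing can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact estimator used in the experiment: it assumes the variance is normalized by the number of samples, and that the single-sensor posterior has the same Bayes-rule form as the fused posterior in (6). The function names and the toy training values are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Sample mean and variance (1/n normalization, an assumption) of training features."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


def gaussian_pdf(x, mu, var):
    """Gaussian likelihood as in (2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def single_sensor_posterior(x, absent, present, prior_present=0.5):
    """Posterior of human presence for one sensor.

    `absent` and `present` are (mean, variance) pairs fitted on training
    data for the null and human-present hypotheses, respectively.
    """
    l0 = gaussian_pdf(x, *absent)
    l1 = gaussian_pdf(x, *present)
    num = l1 * prior_present
    return num / (num + l0 * (1.0 - prior_present))


if __name__ == "__main__":
    # Hypothetical training features for one sensor.
    absent = fit_gaussian([0.0, 0.2, -0.2, 0.1, -0.1])
    present = fit_gaussian([1.0, 1.2, 0.8, 1.1, 0.9])
    for x in (0.0, 0.5, 1.0):
        print(x, single_sensor_posterior(x, absent, present))
```

With equal variances and the uninformative prior, a feature halfway between the two class means yields a posterior of 0.5, which is a convenient sanity check.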

The  fusion  of  the  sensors  can  easily  be  implemented  via 
Bayes  rule  by  making  the  reasonable  assumption  that  the 
sensor  data  for  different  modalities  are  statistically 
independent  when  conditioned  on  one  of  the  two 
hypotheses.  The  posterior  probability  of  human  presence 
given  data  from  all  three  sensors  is, 

$$p(H_1 \mid \mathbf{x}) = \frac{p(H_1) \prod_{s \in S} l_{H_1}(x_s)}{p(H_0) \prod_{s \in S} l_{H_0}(x_s) + p(H_1) \prod_{s \in S} l_{H_1}(x_s)} \qquad (6)$$

where $\mathbf{x} = [x_{\text{acoustic}}, x_{\text{PIR}}, x_{\text{seismic}}]^T$ is the concatenation of
features from all three sensor modalities. Figure 7 also
shows the posterior probability that is the result of the
data fusion. In the end, the detector simply declares a
human present if the posterior exceeds a threshold. Clearly, the
PIR  is  the  best  single  sensor  for  detecting  personnel.  The 
fusion  is  able  to  maintain  a  high  posterior  probability 
when  the  PIR  is  able  to  detect  the  human.  One  downside 
to  the  PIR  is  its  limited  field  of  view.  Fortunately,  the 
fusion  given  by  (6)  provides  the  advantage  of  detecting  a 
human  when  the  human  fails  to  cross  through  the  field  of 
view  of  the  PIR. 
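
A minimal sketch of the fusion rule in (6) is given below. It works with log-likelihoods, which is an implementation choice made here for numerical stability and is not stated in the text. The model parameters in the usage example are hypothetical, not values from the Corridor Scenario.

```python
import math


def log_gaussian(x, mu, var):
    """Logarithm of the Gaussian likelihood in (2)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def fused_posterior(features, models, prior_present=0.5):
    """Posterior of human presence given all sensors, following (6).

    features: {sensor_name: feature_value}
    models:   {sensor_name: {"H0": (mean, var), "H1": (mean, var)}}
    Sensors are assumed conditionally independent given the hypothesis.
    """
    log0 = math.log(1.0 - prior_present)
    log1 = math.log(prior_present)
    for sensor, x in features.items():
        log0 += log_gaussian(x, *models[sensor]["H0"])
        log1 += log_gaussian(x, *models[sensor]["H1"])
    diff = log0 - log1
    if diff > 700.0:  # exp() would overflow; the posterior is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    # Hypothetical per-sensor models: (mean, variance) under each hypothesis.
    models = {s: {"H0": (0.0, 1.0), "H1": (2.0, 1.0)}
              for s in ("acoustic", "PIR", "seismic")}
    print(fused_posterior({"acoustic": 2.0, "PIR": 2.0, "seismic": 2.0}, models))
    print(fused_posterior({"PIR": 2.0}, models))  # a single sensor on its own
```

The final detector would compare the returned value against a threshold, as described above.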


The  performance  of  this  detection  system  can  be  enhanced 
by  considering  the  temporal  signature  of  the  target  due  to 
footfalls.  Furthermore,  imagers  can  be  used  to  collect  a 
feature  based  upon  gait.  Future  work  will  investigate  the 
performance  gains  by  using  better  features. 

5  Detection  of  Human  Activity  and 
Infrastructure 

This  section  discusses  human  activity  and  infrastructure 
detection.  A  number  of  scenarios  described  in  Section  3 
are  applicable.  This  section  will  focus  on  the  Machine 
Shop  scenario  to  illustrate  how  sensor  data  can  be  used  to 
distinguish  patterns  caused  by  human  activities  from  those 
caused  by  natural  phenomena.  The  scenario  covers  all 
aspects  of  the  human  activity  and  infrastructure  detection 
that  we  would  like  to  address. 

The  Machine  Shop  scenario  consists  of  data  collected 
from  sensors  observing  a  secluded  room  that  includes  a 
drill press. A color video camera and a long-wave infrared
(LWIR) camera were aimed at the drill press from a
distance of 20 feet with similar fields of view. An
acoustic  sensor  (microphone),  a  chemical  sensor,  a 
seismic  sensor,  a  magnetic  sensor,  and  an  electrostatic 
sensor  were  placed  within  10  feet  of  the  drill  press.  An 
identical  suite  of  sensors  was  placed  outside  the  room.  An 
electrostatic  sensor  was  placed  near  an  electrical  power 
distribution  box  that  is  far  away  from  the  room.  During 
the  operation  of  the  drill  press,  the  door  to  the  room  was 
closed.  Prior  to  the  actual  experiment,  all  sensors  were 
allowed  to  collect  background  noise  for  about  3  minutes. 
Then  an  operator  opened  the  door,  went  into  the  secluded 
room,  closed  the  door,  and  walked  to  the  drill  press,  and 
turned  on  the  drill  press.  After  that,  the  operator  drilled  a 
wooden  plank  for  about  3  minutes  and  then  left  the  room. 
The  infrared  camera  was  on  for  another  three  hours  after 
the  drill  press  was  turned  off.  Some  of  the  sensors  used  in 
this  experiment  are  shown  in  Figure  2. 


Figure 8: Seismic sensor output when the drill press is
turned on during the machine shop experiment.


The  ultimate  goal  is  to  design  a  robotic  sensor  system  that 
can  roam  a  site  and  automatically  determine  that  the  site 
contains man-made equipment, i.e., human infrastructure,
that currently supports (or recently supported) human
activity. To this end, the sensors must monitor the site
and  determine  if  the  output  signals  are  consistent  with  the 
usage  of  man-made  machinery  as  opposed  to  a  benign 
background. 

Various  sensors  can  be  used  to  detect  infrastructure.  In  the 
Machine  Shop  scenario,  magnetic  and  electrostatic 
sensors  can  easily  detect  when  the  drill  press  is  active. 
Successful  monitoring  is  accomplished  by  considering  the 
sensor  outputs  from  different  locations,  including  the 
sensor  suites  near  the  drill  press  and  outside  of  the 
secluded  room,  as  well  as  those  away  from  the  room  at  the 
electrical  distribution  box  of  the  building. 

Figure 9 shows the magnetic and electrostatic sensor
outputs before and while the drill press is turned
on. When the drill press is off, the ambient E and B fields
include a dominant 60 Hz harmonic due to radiation from
outside power lines. Furthermore, the 60 Hz E and B field
harmonics are 90 degrees out of phase. Because the power
lines are not in the vicinity, the 60 Hz signal is noisy.
When the drill press is on, the higher resulting signal
amplitudes and the fact that the phase shift between the E
and B fields is no longer 90 degrees offer clues that
some sort of man-made machine is currently operating.
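
The two cues named above (a larger 60 Hz amplitude and an E/B phase shift that departs from 90 degrees) can be checked with a short sketch. This is an illustration under assumptions rather than the processing used for Figure 9: the single-bin DFT, the function names, and the thresholds `amp_gain` and `phase_tol_deg` are hypothetical, and the record is assumed to span a whole number of 60 Hz cycles.

```python
import cmath
import math


def phasor(signal, fs, freq=60.0):
    """Complex amplitude of `signal` at `freq` Hz (single-bin DFT)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * acc / n


def eb_phase_shift_deg(e_field, b_field, fs, freq=60.0):
    """Phase of the E field relative to the B field at `freq`, in degrees."""
    ratio = phasor(e_field, fs, freq) / phasor(b_field, fs, freq)
    return math.degrees(cmath.phase(ratio))


def machine_suspected(e_field, b_field, fs, baseline_amp,
                      amp_gain=2.0, phase_tol_deg=15.0):
    """Flag a record whose 60 Hz amplitude or E/B phase departs from the ambient case."""
    amp = abs(phasor(e_field, fs))
    shift = abs(eb_phase_shift_deg(e_field, b_field, fs))
    return amp > amp_gain * baseline_amp or abs(shift - 90.0) > phase_tol_deg


if __name__ == "__main__":
    fs, n = 1200.0, 1200  # one second of data, exactly 60 cycles
    w = 2 * math.pi * 60.0
    b = [math.sin(w * k / fs) for k in range(n)]
    e_off = [math.cos(w * k / fs) for k in range(n)]                            # 90 degrees from B
    e_on = [3.0 * math.cos(w * k / fs + math.radians(30.0)) for k in range(n)]  # stronger and shifted
    print(eb_phase_shift_deg(e_off, b, fs), machine_suspected(e_off, b, fs, 1.0))
    print(eb_phase_shift_deg(e_on, b, fs), machine_suspected(e_on, b, fs, 1.0))
```

Because the ambient 60 Hz signal is described as noisy, a practical version would average the phasors over several records before comparing against the tolerances.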

Figure 9: Output of E and B-field sensors

Visible cameras provide good clues about the presence of
the man-made object. Pattern recognition technology may
some day allow for the automatic detection and
recognition of different machines from imagery collected by
visible cameras. For the near term, automatic techniques
can exploit the fact that man-made objects tend to be
composed of canonical geometric shapes that have
smooth edges and sharp corners. On the other hand,
natural objects tend to be rough and jagged.

We  are  currently  developing  algorithms  to  segment  out 
man-made  objects  from  a  visible  image  by  analyzing  the 
contour  of  the  objects.  First,  the  image  is  segmented 
using  techniques  based  on  edges  [7]  and  unified  regions 
[8].  Then,  the  contour  associated  to  each  segment  is 
analyzed.  Specifically,  features  such  as  fractal  dimension 
via  box  counting  [14]  and  curvature  [15]  will  be 
extracted.  The  features  will  be  used  to  develop  a  Bayesian 
man-made  object  detector  similar  to  how  amplitude 
features  are  used  in  the  human  presence  method  of 
Section  4. 
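
One of the contour features mentioned above, the fractal dimension via box counting, can be sketched as follows. This is a generic box-counting estimate, not necessarily the procedure of [14]; the function names and the choice of box sizes are hypothetical. The working assumption is that a smooth man-made contour gives a dimension near 1, while a rough, jagged contour gives a larger value.

```python
import math


def box_count(points, box_size):
    """Number of grid boxes of side `box_size` that contain at least one point."""
    return len({(math.floor(x / box_size), math.floor(y / box_size)) for x, y in points})


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(s) against log(1/s) over the given box sizes."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # A straight edge sampled at every pixel: the estimate should be close to 1.
    edge = [(i, 0) for i in range(1024)]
    print(box_counting_dimension(edge, [2, 4, 8, 16, 32]))
    # A filled square, used only as a check on the other extreme: close to 2.
    square = [(i, j) for i in range(64) for j in range(64)]
    print(box_counting_dimension(square, [2, 4, 8, 16]))
```

The resulting value would be one entry of the contour feature vector used by the Bayesian man-made object detector.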

Let $H_{o,0}$ and $H_{o,1}$ represent the hypotheses that the $i$-th
contour is natural and man-made, respectively. The set of
features for the $i$-th contour is represented by the vector $\mathbf{f}_i$.
Training data will be used to determine the likelihoods of
the two hypotheses as

$$l_{o,0}(\mathbf{f}_i) = p(\mathbf{f}_i \mid H_{o,0}), \qquad l_{o,1}(\mathbf{f}_i) = p(\mathbf{f}_i \mid H_{o,1}). \qquad (7)$$

The likelihoods may be modeled by Gaussians as in (2) or
by other distributions if necessary. Similar to (5), the
posterior probability for $H_{o,1}$ is computed as

$$p(H_{o,1} \mid \mathbf{f}_i) = \frac{l_{o,1}(\mathbf{f}_i)\, p(H_{o,1})}{l_{o,1}(\mathbf{f}_i)\, p(H_{o,1}) + l_{o,0}(\mathbf{f}_i)\, p(H_{o,0})} \qquad (8)$$


The man-made object that leads to the detected contour
could be abandoned. It is crucial to determine if the
contour represents a recently used object. When a
man-made machine is used, heat will be generated at the
friction points. This means that the object will radiate heat
at concentrated locations. On the other hand, if the object
has been turned off for a long time, the only heat is
created by solar loading, which tends to distribute the heat
evenly over the object. Thus, the distribution of heat over
an object contour can provide inference about recent
human activity. Furthermore, the decay of the heat over
time may indicate whether the heat source is
man-made or natural.

The  heat  distribution  over  an  object  contour  requires 
registration  of  the  visible  and  infrared  imagery.  There  are 
a number of difficulties in registering the outputs from
these cameras due to their differences in focal length,
field of view, lens characteristics, image resolution, as
well as viewing aspect and height. We are developing
algorithms to register images from visible and IR cameras
based on geometric transformations and stereo techniques;
see [9], [11], [12].

Once  the  images  from  IR  and  visible  cameras  are  properly 
registered,  we  will  use  them  to  detect  certain  human 
activities.  In  the  case  of  the  machine  shop  scenario,  we 
may  detect  recent  human  activities  based  on  the  thermal 
footprint  of  the  drill  press,  even  when  the  drill  press  has 
been  idle  for  a  period  of  time.  Both  spatial  and  temporal 
features  describing  the  distribution  of  the  heat  may 
provide inference. For instance, we plan to derive spatial
features $\mathbf{v}$ that represent the spread of the heat distribution
over the interior of the contour, e.g., variance or entropy.
We also plan to derive a temporal feature $t$ that represents
the “average” decay of heat as a function of time over the
interior of the contour.
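
The spatial and temporal heat features can be sketched as below. The variance and entropy follow the examples named in the text; the temporal feature is computed here as an exponential decay rate, which is an assumption (Newtonian cooling toward a known ambient temperature) and not a definition given in the paper. Inputs are assumed to be per-pixel temperature excesses over ambient that are non-negative and not all zero; the function names are hypothetical.

```python
import math


def spatial_spread(temps):
    """Variance and Shannon entropy of the heat distribution inside a contour."""
    n = len(temps)
    mean = sum(temps) / n
    variance = sum((t - mean) ** 2 for t in temps) / n
    total = sum(temps)
    probs = [t / total for t in temps if t > 0]   # normalize heat into a distribution
    entropy = -sum(p * math.log(p) for p in probs)
    return variance, entropy


def decay_rate(times, mean_temps, ambient):
    """Rate k in T(t) = ambient + A * exp(-k t), fitted by least squares on log(T - ambient)."""
    ys = [math.log(t - ambient) for t in mean_temps]
    n = len(times)
    mx, my = sum(times) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in times)
    sxy = sum((x - mx) * (y - my) for x, y in zip(times, ys))
    return -sxy / sxx


if __name__ == "__main__":
    print(spatial_spread([1.0, 1.0, 1.0, 1.0]))   # evenly distributed heat
    print(spatial_spread([4.0, 0.0, 0.0, 0.0]))   # heat concentrated at one point
    times = [0.0, 1.0, 2.0, 3.0]
    temps = [20.0 + 10.0 * math.exp(-0.5 * t) for t in times]
    print(decay_rate(times, temps, ambient=20.0))
```

Evenly distributed heat (as from solar loading) gives low variance and high entropy, whereas heat concentrated at friction points gives the opposite pattern.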


Once the features are defined, the activity detector
consists of the calculation of the posterior probability of
the human activity hypothesis $H_{a,1}$ over the entire scene
given the activity and contour features for the entire
scene. To this end, training data will be used to derive the
likelihoods of the no-activity $H_{a,0}$ and human-activity $H_{a,1}$
hypotheses for each of the $N_c$ contours. The likelihoods
associated with the $i$-th contour are derived only when the
object under analysis is man-made, so that the likelihoods are

$$\begin{aligned}
l_{a,1}(\mathbf{v}_i, t_i) &= p(\mathbf{v}_i, t_i \mid H_{a,1}, H_{o,1}), \\
l_{a,0}(\mathbf{v}_i, t_i) &= p(\mathbf{v}_i, t_i \mid H_{a,0}, H_{o,1}).
\end{aligned} \qquad (9)$$

Again, the likelihoods may be modeled as Gaussians or
some other distribution if necessary. The likelihoods for
the $i$-th contour conditioned on the contour features are
simply the likelihoods defined in (9) multiplied by the
posterior probability that the contour is man-made. The
likelihoods that the entire scene does or does not contain
evidence of human activity assume that the features for
each contour are independent, conditioned on the activity
hypothesis, i.e.,

$$\begin{aligned}
l_{a,1}(V, T \mid F) &= \prod_{i=1}^{N_c} l_{a,1}(\mathbf{v}_i, t_i)\, p(H_{o,1} \mid \mathbf{f}_i), \\
l_{a,0}(V, T \mid F) &= \prod_{i=1}^{N_c} l_{a,0}(\mathbf{v}_i, t_i)\, p(H_{o,1} \mid \mathbf{f}_i),
\end{aligned} \qquad (10)$$

where $N_c$ is the number of contours in the scene.

The human activity likelihood due to imagers will be
combined with similar likelihoods computed from
acoustic and seismic sensors. As in Section 4, the direct
presence of humans will be obtained by spectral features $S$ and
gait features $G$ for acoustic and seismic sensors,
respectively, and the likelihoods for the $H_{a,0}$ and $H_{a,1}$
hypotheses are derived by (1). Other features will be
derived for the acoustic and seismic sensors to pick up
60 Hz harmonics due to the machinery. Let us label these
machinery features as $H_a$ and $H_s$ for the acoustic and
seismic sensors, respectively. Given that all sensor
features are statistically independent when conditioned on
either hypothesis, the likelihoods for the two
hypotheses using all features are

$$\begin{aligned}
l_{a,1}(X \mid F) &= l_{a,1}(V, T \mid F)\, l_{a,1}(S)\, l_{a,1}(H_a)\, l_{a,1}(G)\, l_{a,1}(H_s), \\
l_{a,0}(X \mid F) &= l_{a,0}(V, T \mid F)\, l_{a,0}(S)\, l_{a,0}(H_a)\, l_{a,0}(G)\, l_{a,0}(H_s),
\end{aligned} \qquad (11)$$

where $X$ is the concatenation of all the features. Finally,
the posterior probability that the scene contains human
activity is

$$p(H_{a,1} \mid X) = \frac{l_{a,1}(X \mid F)\, p(H_{a,1})}{l_{a,1}(X \mid F)\, p(H_{a,1}) + l_{a,0}(X \mid F)\, p(H_{a,0})} \qquad (12)$$

Figure 10 provides a flow graph to illustrate the
multi-sensor processing to calculate the posterior probability
given in (12). The detector declares the existence of
human activity if the posterior probability in (12) exceeds
a threshold. Future work will develop the modules in
Figure 10 and evaluate the receiver operating
characteristic (ROC) curves associated with the
corresponding detector.
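
Since the modules are future work, the following is only a sketch of how the scene-level combination could be coded once per-feature likelihoods are available. It assumes the product-over-contours form described for the imager likelihoods and the conditional independence of all sensor features; the function names and all numeric values are hypothetical.

```python
import math


def scene_log_likelihood(contour_likelihoods, manmade_posteriors):
    """Log of the imager likelihood for the scene.

    Each contour contributes its activity likelihood multiplied by the
    posterior probability that the contour is man-made.
    """
    return sum(math.log(l) + math.log(p)
               for l, p in zip(contour_likelihoods, manmade_posteriors))


def activity_posterior(log_l1_terms, log_l0_terms, prior_activity=0.5):
    """Posterior of the human-activity hypothesis from per-feature log-likelihoods."""
    log1 = sum(log_l1_terms) + math.log(prior_activity)
    log0 = sum(log_l0_terms) + math.log(1.0 - prior_activity)
    diff = log0 - log1
    if diff > 700.0:  # exp() would overflow; the posterior is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    # Hypothetical values for a scene with two contours.
    manmade = [0.9, 0.5]                       # posterior that each contour is man-made
    img1 = scene_log_likelihood([0.8, 0.6], manmade)   # under human activity
    img0 = scene_log_likelihood([0.2, 0.3], manmade)   # under no activity
    # One extra, uninformative acoustic term (equal likelihood under both hypotheses).
    p = activity_posterior([img1, math.log(0.5)], [img0, math.log(0.5)])
    print(p, p > 0.5)   # declare activity if the posterior exceeds a threshold
```

Sweeping the decision threshold over a labeled data set would produce the ROC curve mentioned above.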

Figure 10: Fusion for human activity detection


6  Conclusion 

We  presented  schemes  to  directly  detect  the  presence  of  a 
human  or  to  detect  human  activity  and  infrastructure. 
Both  schemes  take  advantage  of  multiple  sensor 
modalities  through  the  use  of  Bayesian  fusion. 
Experimental  results  demonstrate  the  utility  of  the  fusion 
for  human  presence  detection.  The  method  can  be  further 
improved  by  incorporating  video  data  and  accumulating 
evidence temporally. Future work will center on the
development of the modules that make up the human
activity detection scheme and on enhancing the human
presence detection scheme. Once both schemes are fully
developed,  they  can  be  used  to  determine  the  threat  level 
that  exists  in  urban  terrains,  tunnels,  caves  and  other 
remote  sites. 